A Neural Probabilistic Language Model
https://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf
2003
Abstract
Traditional but very successful approaches based on n-grams obtain generalization by concatenating very short overlapping sequences seen in the training set.
We propose to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences.
The aim is to overcome the curse of dimensionality.
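As a rough sketch of the idea (not the paper's exact configuration): each context word is mapped to a learned feature vector (the distributed representation), the vectors are concatenated, passed through a tanh hidden layer, and a softmax over the vocabulary predicts the next word. The code below assumes PyTorch; the vocabulary size, context length, and layer widths are illustrative, and the paper's optional direct input-to-output connections are omitted.

```python
import torch
import torch.nn as nn

class NPLM(nn.Module):
    """Minimal sketch of a neural probabilistic language model:
    learned word feature vectors -> concatenate -> tanh hidden layer
    -> softmax over the vocabulary for the next word."""
    def __init__(self, vocab_size, context_size=3, embed_dim=60, hidden_dim=100):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)                # word feature matrix
        self.hidden = nn.Linear(context_size * embed_dim, hidden_dim)   # hidden layer
        self.out = nn.Linear(hidden_dim, vocab_size)                    # output layer

    def forward(self, context_ids):
        # context_ids: (batch, context_size) indices of the previous words
        x = self.embed(context_ids).flatten(1)   # concatenate the feature vectors
        h = torch.tanh(self.hidden(x))
        return self.out(h)                       # logits over the next word

# usage: predict the next word from a 3-word context (random toy input)
model = NPLM(vocab_size=10000)
logits = model(torch.randint(0, 10000, (4, 3)))  # batch of 4 contexts
probs = torch.softmax(logits, dim=-1)            # P(w_t | previous 3 words)
```

Because similar words end up with similar feature vectors, a sentence seen in training also raises the probability of many semantically neighboring sentences, which is how the model generalizes where an n-gram model cannot.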